In this section we'll cover two main areas. First, we will
look at troubleshooting general IP connectivity related to routing
protocol issues. Second, we'll examine basic troubleshooting for
popular WAN protocols. In both of these sections we will consider
troubleshooting lost connectivity and poor performance.
Troubleshooting IP Connectivity
The usual way to determine whether a device is reachable via IP
on an internetwork is to use the ping utility. Ping sends an ICMP
echo request from the source to the specified destination. If
successful, the returned echo reply proves that all Physical,
Data Link, and Network layer functions are operating correctly from
the source to the destination. For the purposes of this discussion,
failed IP connectivity will mean that a ping packet does not get a
reply. It will be assumed that all physical connections are in place
and all interfaces and line protocols are up along the path from
source to destination. Initially, the discussion focuses on distance
vector protocols; a section on OSPF follows.
Troubleshooting Distance Vector Protocols. If a ping is failing in
an internetwork with good Physical and Data Link connections, the
first place to look is the
routing tables of all routing devices between the source and
destination. Each one in turn should have an entry that states the
next hop for the ultimate destination network number. For example,
if you are trying to ping a device with an IP address of 164.7.8.33
and the netmask in use on your internetwork is 255.255.255.224, then
you would expect to see an entry for the subnet 164.7.8.32 in all
the routing tables between the ping source and its
destination.
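The subnet number to look for can be worked out from the address and netmask. The following is a minimal sketch using Python's standard ipaddress module; the helper name subnet_for is an illustrative choice, not something from the router software.

```python
import ipaddress

def subnet_for(host: str, netmask: str) -> ipaddress.IPv4Network:
    """Return the subnet that contains host under the given netmask."""
    # strict=False accepts a host address instead of requiring the subnet number
    return ipaddress.ip_network(f"{host}/{netmask}", strict=False)

subnet = subnet_for("164.7.8.33", "255.255.255.224")
print(subnet.network_address)  # 164.7.8.32
print(subnet.prefixlen)        # 27
```

The printed subnet number is the entry you would expect to find in each routing table along the path.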
This simple statement, although neat, might not always be
true. Suppose the device from which you are sending the ping is on
the 170.5.0.0 network. Because routes are summarized at the boundary
between major network numbers, each of the devices on this network
will have only one entry in its routing table for the 164.7.0.0
network. Once the path from source to destination takes you from the
170.5.0.0 network into the 164.7.0.0 network, you expect to see
entries for the 164.7.8.32 subnet. In addition, you expect to see
entries in the routing table of all the devices from the destination
to the source, for the source's subnet number, so that the ping
reply can find its way back to the source.
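The effect of summarization at major network boundaries can be sketched in a few lines. This is a simplified model, not router behavior in full: it assumes classful summarization, treats each router as belonging to a single major network, and uses 170.5.1.1 as a made-up router address. The function names are hypothetical.

```python
import ipaddress

def major_network(addr: str) -> ipaddress.IPv4Network:
    """Return the classful (major) network number for an IPv4 address."""
    first_octet = int(addr.split(".")[0])
    if first_octet < 128:
        prefix = 8    # class A
    elif first_octet < 192:
        prefix = 16   # class B
    elif first_octet < 224:
        prefix = 24   # class C
    else:
        raise ValueError("class D/E addresses have no major network number")
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)

def expected_route(router_addr: str, dest: str, netmask: str) -> ipaddress.IPv4Network:
    """Route entry a router should hold for dest, assuming classful summarization."""
    dest_major = major_network(dest)
    if major_network(router_addr) == dest_major:
        # Same major network: the specific subnet should be visible
        return ipaddress.ip_network(f"{dest}/{netmask}", strict=False)
    # Different major network: only the summarized entry is expected
    return dest_major

print(expected_route("170.5.1.1", "164.7.8.33", "255.255.255.224"))   # 164.7.0.0/16
print(expected_route("164.7.1.66", "164.7.8.33", "255.255.255.224"))  # 164.7.8.32/27
```

A router on the boundary between the two networks has interfaces in both, so in practice you would check it against the more specific entry.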
So what do we do if one of the devices is missing a routing
table entry? The first thing is to check that each device has a
routing protocol appropriately enabled. This means that all devices
need a routing protocol enabled for the same autonomous system
number, and network entries need to be configured for each network
that is directly attached to a router.
Suppose that viewing the configuration of all the routing devices
from source to destination confirms that every device is in fact
correctly configured for a routing protocol. What next? You probably
want to use some debug commands, starting with debug ip igrp events,
which tells you which hosts sent IGRP information and to which hosts
IGRP information was sent. The debug ip igrp transactions command
details the content of the IGRP updates received and sent in terms
of the network numbers and their associated metrics. In a network
using RIP, debug ip rip tells you the content of routing updates and
which routers they are sent to and received from. Figure 8-12
provides sample outputs from these commands.
LOG OUTPUT OF DEBUG IP IGRP EVENTS COMMAND
IGRP: received update from invalid source 193.1.1.1 on Ethernet0
The first line is interesting as it identifies an IGRP update
that appeared on this interface from a source on a different network
number and is therefore not accepted on this interface. The rest of
the display identifies valid updates received and sent via
broadcast.
LOG OUTPUT OF DEBUG IP IGRP TRANSACTIONS COMMAND
IGRP: sending update to 255.255.255.255 via Ethernet0 (164.7.1.66)
      subnet 164.7.1.96, metric=8476
      network 193.1.1.0, metric=8576
IGRP: sending update to 255.255.255.255 via Serial0 (164.7.1.97)
      subnet 164.7.1.64, metric=1100
IGRP: received update from invalid source 193.1.1.1 on Ethernet0
IGRP: received update from 164.7.1.98 on Serial0
      network 193.1.1.0, metric 8576 (neighbor 1100)
This command logs more detailed information on each routing
update received and sent.
LOG OUTPUT OF DEBUG IP RIP COMMAND
RIP: sending update to 255.255.255.255 via Serial0 (164.7.1.97)
RIP: Update contains 1 routes
RIP: received update from invalid source 193.1.1.1 on Ethernet0
RIP: received update from 164.7.1.98 on Serial0
     193.1.1.0 in 1 hops
RIP: sending update to 255.255.255.255 via Ethernet0 (164.7.1.66)
     subnet 164.7.1.96, metric 1
     network 193.1.1.0, metric 2
This log output shows the detail of RIP updates sent and
received, including the interface received on and route
metrics.
Figure 8-12: Control statements and the number of paths they
generate
If the Physical and Data Link layer connections between the
source and destination are good and the routing protocols are
correctly configured but appropriate routes still do not appear in
all the routing tables necessary, the output from these commands
should give a clue as to where the route information is being
dropped. The route information might be dropped because of passive
interfaces, incorrectly configured redistribution, access lists, or
distribute lists.
Assuming the output of the debug commands identifies where in
the chain of routers from source to destination the required routing
information is being dropped, you can examine the configuration of
the suspect router to check for any passive interfaces that will not
send out any routing updates, or those that have a distribute list
applied. Distribute lists are useful if you want to reduce the size
of routing updates sent. A distribute list works with a defined
access list to identify the routes that will be advertised and the
routes that will not. An example configuration of this combination
is given here, permitting only the 164.7.0.0 network number in the
IGRP updates that the router sends.
access-list 1 permit 164.7.0.0
router igrp 11
 network 164.7.0.0
 distribute-list 1 out
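To see which routes such a list lets through, the matching rule of a standard access list can be modeled roughly as below. This is a sketch under stated assumptions: an entry with no wildcard mask is taken to use 0.0.0.0 (exact match), an implicit deny ends the list, and the function name acl_permits is hypothetical.

```python
import ipaddress

def acl_permits(route: str, acl: list[tuple[str, str, str]]) -> bool:
    """Evaluate a standard access list against a route's network number.

    Each entry is (action, address, wildcard); wildcard bits set to 1 are ignored.
    """
    target = int(ipaddress.IPv4Address(route))
    for action, address, wildcard in acl:
        # Bits that must match are the ones NOT set in the wildcard mask
        care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
        if (target & care) == (int(ipaddress.IPv4Address(address)) & care):
            return action == "permit"
    return False  # implicit deny at the end of the list

# access-list 1 permit 164.7.0.0 (no wildcard given, so 0.0.0.0 is assumed)
ACL_1 = [("permit", "164.7.0.0", "0.0.0.0")]

print(acl_permits("164.7.0.0", ACL_1))  # True
print(acl_permits("193.1.1.0", ACL_1))  # False
```

Under this model a subnet route such as 164.7.1.96 would not match the entry as written; a wildcard such as 0.0.255.255 would be needed if individual subnets are meant to pass.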
If passive interfaces or distribute lists are not blocking routing
information, incorrectly configured redistribution could be causing
route update problems. If you
suspect redistribution, the first thing to check is the default
metric configured for redistributed routes. If this is missing or
using a value that makes subsequent routers discard the route
information, an adjustment in its value is necessary.
Having covered typical problems with Network layer
configuration that result in lost connectivity, what can we do if
two nodes can communicate, but there is a severe performance
problem? We would first look at Physical and Data Link issues, such
as performance of leased lines, framing or line errors, buffers, and
hold queues. If checking these areas fails to uncover any problems, we
can examine what is happening at the Network layer to cause slow
performance. The best tool for diagnosing performance problems that
are suspected to be due to Network layer operation is the Cisco
trace command.
An example of the use of trace is given in Fig. 8-13.
Router1>trace ip 164.7.1.97
Type escape sequence to abort.
Tracing the route to 164.7.1.97
  1 164.7.1.66 4 msec 4 msec *
Router1>trace ip 164.7.1.98
Type escape sequence to abort.
Tracing the route to 164.7.1.98
  1 164.7.1.66 4 msec 4 msec 4 msec
  2 164.7.1.98 20 msec 20 msec *
Router1>trace ip 164.7.1.99
Type escape sequence to abort.
Tracing the route to 164.7.1.99
In the first part of the display, we see how trace reports a
successful path from router 1 to router 3 through router 2 in the lab
setup of three routers we have used throughout the book. The trace
reports the path that is taken from source to destination. The last
part of Fig. 8-13 shows an unsuccessful trace. Here, the trace command
reports that the router can find the subnet where the destination IP
address should be located, but it cannot find the host on the subnet,
so it queries all devices it knows about on this subnet to see if they
know about the target address. This display will continue
indefinitely until it is stopped by the escape sequence (pressing the
Ctrl, Shift, and 6 keys simultaneously).
With internetworks that are either poorly designed or
experiencing a fault condition, the paths selected for routes
can become suboptimal, leading to poor performance. A typical
example is a bridge or repeater connected through the
internetwork to two interfaces on a router. This leads to the
router receiving duplicate routing updates, effectively receiving the
same updates on two interfaces. This is a condition that a router
cannot deal with effectively, and unpredictable routing decisions
result. The trace command output can identify these types of problems
by reporting the path packets take from source to destination. If a
suboptimal path is taken, the interconnections made on the
internetwork need to be examined to resolve any conditions that the
routing protocols cannot handle.
Overview of OSPF Troubleshooting. OSPF and other link-state
protocols are more complex to troubleshoot than distance
vector protocols. The problem we will consider initially is that of
a ping packet that is not returned successfully from a remote host.
Assuming that the Physical and Data Link layers have checked out and
that all devices have OSPF enabled (on Cisco routers the OSPF process
number is locally significant, so it need not match between routers),
troubleshooting an OSPF internetwork starts off in the same
way as for an IGRP internetwork. The first task is to review the
routing table entries, because each routing device from source to
destination must have routing table entries that enable a packet to
be routed in both directions for a ping request to be
successful.
Assuming that the ping fails because of a lack of routing
table entries, we would look first for any passive interfaces or
distribute lists stopping the route information from being
disseminated. If none exists, the OSPF configuration for each router
device must be reviewed. All interfaces that are to participate in
OSPF routing need to have the network numbers to which they belong
listed in the network commands that are entered as subcommands under
the OSPF major command.
You can check whether OSPF is running on all expected
interfaces by issuing the show ip ospf interface command for each interface. (A
sample display was illustrated in Fig. 4-17.) Obviously, any interface that is
not reporting OSPF information for this command is incorrectly
configured and needs to be investigated.
Another useful command for troubleshooting at this level is
the show ip ospf neighbor command, illustrated in Fig. 8-14, which
identifies all the neighbors the
router knows about via the OSPF protocol. If a router that you
expect to see on this list does not show up, further investigation
into the missing router's configuration is required.
In this display, ID is the router ID of the OSPF neighbor; Pri is the
priority of that router, which affects its chances of being chosen as
the designated router; and Address is the source address of the
interface that advertised this router. The advertisement was received
through the interface listed.
Figure 8-14: Output of the show ip ospf neighbor command
If all the routers appear in the show ip ospf neighbor display and
all interfaces are enabled for OSPF, but you still cannot ping the
desired host, check to make sure that each OSPF area has at least one
border router and that the border
router is connected to area 0. The only way to check this is by
viewing the configuration of the border router. This is an important
configuration requirement for OSPF internetworks, as all interarea
communication has to go through area 0.
The last consideration is that of mismatched hello and dead
timers. These timers can be viewed in the show ip ospf interface
display. The values must be the same on all router interfaces
attached to the same network, or the routers will not become
neighbors. If mismatched values are found, they can be altered in
interface configuration mode by the ip ospf dead-interval and
ip ospf hello-interval commands.
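When several routers share a network, it helps to line up the timer values side by side. The following is a small sketch of that comparison; the router names and timer values are hypothetical, standing in for values read manually from each router's show ip ospf interface display.

```python
from collections import Counter

def mismatched_timers(timers: dict[str, tuple[int, int]]) -> list[str]:
    """Return routers whose (hello, dead) pair differs from the most common pair."""
    if not timers:
        return []
    majority, _ = Counter(timers.values()).most_common(1)[0]
    return sorted(name for name, pair in timers.items() if pair != majority)

# Hypothetical (hello, dead) values in seconds for one shared network
segment = {
    "router1": (10, 40),
    "router2": (10, 40),
    "router3": (30, 120),
}
print(mismatched_timers(segment))  # ['router3']
```

The sketch only flags the odd one out; which value is the correct one for the network remains a design decision.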
Troubleshooting Packet-Oriented WAN Protocols
This section will provide the essential information for
initial troubleshooting of the packet-oriented WAN protocols, frame
relay and X.25. The focus of this section is to explore why nodes
might not be communicating for each of the protocols considered.
Typically, there is more in the router configuration of an X.25
connection that can cause intermittent or poor performance than in a
frame relay connection.
A frame relay connection is normally configured for
connection to a public network, and performance issues with this
type of connection are generally linked to the public network
itself, or to Physical layer issues such as noisy lines or router
buffer problems. Troubleshooting noisy lines and router buffer
problems for frame relay follows the same process as described
previously. We shall therefore look mainly at what can cause
connectivity to fail in a frame relay environment. With X.25, we
will look at router issues that can contribute to poor performance
as well as to no connectivity.
Troubleshooting Lack of Connectivity over Frame Relay. The most
usual configuration for connecting a router to a public frame relay
network is for the public frame relay network to send data link
connection identifier (DLCI) information to the router via an agreed
LMI type. Again assuming that everything at the Physical and Data
Link layers is working and that the show interface serial command
reports an up condition for both interface and line protocol status,
we will first want to see if LMI information is being received and
sent.
The place to start is with the show frame-relay map command, as
shown in Fig. 6-8, which will tell you whether the router has
successfully learned the protocol addresses of the remote devices. If
this process has failed, there will be no entries in the display of
this command. The process that should take place is that the local
management interface (LMI) informs the router of the available DLCI
numbers and the router uses Inverse ARP to determine the protocol
address of the devices at the other end of the PVCs identified by the
DLCIs.
If a router is not registering the available DLCIs on its
frame relay connection (which can be determined by issuing the
show frame-relay pvc command, as illustrated in Fig. 6-8), you should
determine whether the switch is sending the information via the LMI.
In this situation, the debug frame-relay lmi command should be used.
If the frame relay switch is sending DLCI numbers via the LMI, they
will be listed in the output of this command. If this debug command
lists DLCI numbers being sent by the frame relay switch that are not
shown in the show frame-relay map display, the LMI type used by the
router should be confirmed as correct. If no DLCI numbers are
listed, you need to contact the company supplying the frame relay
connection to have the frame relay switch send the correct
data.
This covers LMI operation; checking whether Inverse ARP
worked is a little trickier. If the router learns about its DLCI
numbers but does not establish entries in its show frame-relay map
command output,
there are two possibilities. Either the two nodes communicating over
the frame relay network are not configured to send broadcast routing
updates, or the frame relay network is improperly configured.
Chapter 6 covered setup of broadcast IGRP
updates over frame relay links, and I will not repeat that here. If
you find a situation in which the router knows of its DLCI numbers,
and you are sure that all connected devices are set up for
broadcast, and that IGRP, or some other appropriate routing
protocol, is properly enabled on the connected devices, you have to
inform the frame relay provider that Inverse ARP is not working over
the network and seek help. In the meantime, static maps can be
entered into the router with the frame-relay map command.
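The decision steps of the last few paragraphs can be summarized as a short routine. This is one reading of those steps, not a Cisco tool: the three sets are assumed to be DLCI numbers copied by hand from the debug frame-relay lmi, show frame-relay pvc, and show frame-relay map displays, and the function name and DLCI values are hypothetical.

```python
def next_frame_relay_check(lmi_dlcis: set[int], pvc_dlcis: set[int],
                           mapped_dlcis: set[int]) -> str:
    """Suggest the next troubleshooting step from three sets of DLCI numbers."""
    if not lmi_dlcis:
        # The switch is not announcing any DLCIs over the LMI
        return "no DLCIs in LMI debug output: contact the frame relay provider"
    if lmi_dlcis - pvc_dlcis:
        # The switch announces DLCIs that the router has not registered
        return "switch sends DLCIs the router has not registered: confirm the LMI type"
    if pvc_dlcis - mapped_dlcis:
        # DLCIs are known, but Inverse ARP has not produced map entries
        return "DLCIs known but unmapped: check broadcast setup and Inverse ARP, or add static maps"
    return "all DLCIs are mapped: look elsewhere"

# Hypothetical DLCI numbers, for illustration only
print(next_frame_relay_check({16, 17}, {16, 17}, {16}))
```

The order of the checks follows the text: LMI operation first, then Inverse ARP.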
Troubleshooting Lack of Connectivity over X.25. There are many
similarities between a frame relay
link and an X.25 link, such as the use of packet switching, PVC and
SVC allocation, and support for multiple logical connections being
established over a single physical connection. The differences
between the two technologies are significant enough, however, to
justify different troubleshooting procedures.
In frame relay, the DLCI number provides the key to
delivering traffic on a frame relay connection. In X.25, the X.121
address is the key addressing element. A DLCI and an X.121 address
are very different things. The DLCI has only local significance, so
the same DLCI number can be assigned at both ends of the link and it
will still work. The X.121 address, by contrast, has significance
throughout the X.25 network and is an address that can be used to
identify a single host on the network. Also, X.25 does not have the
same Inverse ARP capabilities, so the router's configuration must be
filled with all the necessary X.25 map statements to map IP to X.25
addresses.
Having established that X.25 is a very different type of
packet switching technology from frame relay, let's consider how to
resolve X.25 issues that can result in no connectivity across a
network, then those that result in poor connectivity.
The case that we will use for this discussion is of two IP
networks interconnected via an X.25 network, which is similar to the
configuration used in the discussion on configuring X.25 interfaces
in Chap. 6.
Assuming that the leased lines and cables at the Physical layer
are operational but connectivity across the X.25 network is still
not available, we need to consider what at the X.25 level can stop
communication. The first thing to do is confirm that the X.121
addresses are correct, both for the address assigned to your X.25
interface and those used to address remote hosts in the x25 map
statements.
Next, check whether routing updates are getting from and to
the remote locations. To do this, you must view the configuration of
the routers and make sure that all the x25 map statements include the
keyword broadcast. This keyword ensures that IGRP or other routing
protocol updates are transported over the X.25 network to all remote
locations defined in the x25 map statements.
The remaining issues that we shall consider can degrade
performance, or in severe cases, deny connectivity altogether over
an X.25 network.
The show interface serial command output for an interface with X.25
encapsulation (as illustrated in Chap. 6) lists reject (REJ),
Receiver Not Ready (RNR), frame reject (FRMR), line disconnect
(DISC), and protocol restart (RESTART) values that should all be
low, by which I mean less than 0.5 percent of the number of
information frames (IFRAME). If any of these values is greater than
0.5 percent, there is either a problem at the Physical level with
the hardware and leased lines, or a configuration mismatch. With
X.25 connections, you need to be concerned about matching the
configuration of many variables for the connected devices at both
the LAPB and X.25 levels to ensure optimum communication. Table 8.2
lists the ID of each variable as reported by the show interface
serial and show x25 vc commands, a description of the variable, and
the configuration command to change it.
Table 8.2: LAPB and X.25 Configuration Variables

ID of the Variable | Description | Configuration Command
-------------------|-------------|----------------------
LAPB T1 | Retransmission timer, or how long the router will wait for an acknowledgment before polling for a response | lapb t1 (value in milliseconds)
LAPB N1 | Maximum bits per frame | lapb n1 (no. of bits)
LAPB N2 | Number of retransmit attempts allowed before the link is declared down | lapb n2 (no. of tries)
LAPB k | LAPB window size, the maximum number of frames that can be transmitted before an acknowledgment is required | lapb k (number)
LAPB modulo | Frame numbering scheme; the maximum window size is the modulo less 1 | lapb modulo (8 or 128)
channels: incoming, two-way, outgoing | The lowest and highest permissible incoming, outgoing, and two-way X.25 logical channel numbers | x25 lic, hic, ltc, htc, loc, hoc
x25 modulo | The packet sequence numbering scheme | x25 modulo (8 or 128)
window size input, output | The window size configured for X.25 inbound and outbound packets | x25 win, wout
packet size input, output | The maximum X.25 packet size | x25 ips, ops
x25 timers | T10–13 for a DCE and T20–23 for DTE, set the restart, call, reset, and clear timers | x25 t10, t11, t12, t13, t20, t21, t22, t23
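The 0.5 percent rule of thumb given before the table is easy to apply by hand, but a short sketch makes the arithmetic explicit. The counter values below are invented for illustration and are not taken from a real router; the function name is hypothetical.

```python
THRESHOLD = 0.005  # 0.5 percent of information frames

def excessive_counters(counters: dict[str, int], iframes: int) -> dict[str, float]:
    """Return the counters whose ratio to IFRAMEs exceeds the threshold."""
    if iframes <= 0:
        raise ValueError("need a positive IFRAME count")
    return {
        name: count / iframes
        for name, count in counters.items()
        if count / iframes > THRESHOLD
    }

# Hypothetical values copied from a show interface serial display
sample = {"REJ": 3, "RNR": 0, "FRMR": 1, "DISC": 0, "RESTART": 120}
print(excessive_counters(sample, iframes=10_000))  # {'RESTART': 0.012}
```

Any counter the function reports points to either a Physical layer problem or a configuration mismatch, as described above.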
If you can verify cabling, hardware, and leased-line
operation; have appropriate addresses and X.25 DTE/DCE
configuration; and can verify that all of the variables listed in
Table 8.2 are compatible between the two communicating devices,
there should be no reason to stop communication. If problems still
exist, the last resort is to connect a serial line analyzer and see
if one end is sending the SABM initialization sequence and the other
is responding with UA frames. If all the troubleshooting activities
listed here check out okay (i.e., that everything is functioning as
it should), you should contact the X.25 network vendor and ask for
assistance in resolving any other issues.
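Comparing the variables at both ends is tedious by eye, so a rough sketch of the comparison may help. It makes two simplifying assumptions that are not from the text: "compatible" is treated as "equal", which is stricter than some variables require, and only the LAPB window rule from Table 8.2 (window no larger than the modulo less 1) is checked. All names and values are hypothetical.

```python
def incompatible_settings(local: dict[str, int], remote: dict[str, int]) -> list[str]:
    """List settings that differ between two ends, plus LAPB window violations."""
    problems = []
    for name in sorted(local.keys() | remote.keys()):
        if local.get(name) != remote.get(name):
            problems.append(f"{name}: local={local.get(name)} remote={remote.get(name)}")
    for side, cfg in (("local", local), ("remote", remote)):
        k, modulo = cfg.get("lapb k"), cfg.get("lapb modulo")
        # Table 8.2: the maximum window size is the modulo less 1
        if k is not None and modulo is not None and k > modulo - 1:
            problems.append(f"{side}: lapb k={k} exceeds modulo-1 ({modulo - 1})")
    return problems

# Hypothetical values for the two ends of one link
local = {"lapb t1": 3000, "lapb n2": 20, "lapb k": 7, "lapb modulo": 8}
remote = {"lapb t1": 3000, "lapb n2": 20, "lapb k": 7, "lapb modulo": 128}
print(incompatible_settings(local, remote))  # ['lapb modulo: local=8 remote=128']
```

An empty list does not prove the link is healthy; it only rules out the mismatches this sketch knows how to look for.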
The outputs of the debug lapb and debug x25
commands provide extensive and in-depth analysis of the
communication between X.25-connected devices. If you are going to
spend considerable amounts of time with LAPB and X.25 communication
problems, it is worth referring to the Cisco documentation or
talking to a Cisco Systems engineer.